[57] ABSTRACT

A compiler method converts an indirect call to a callee 
routine in a caller routine program listing, to an in-line 
listing of, or a direct call to, the callee routine in the caller 
routine. An indirect call is a call to a callee routine wherein 
the callee routine is not absolutely identified until run time 
of the program listing. The method includes the steps of: 
comparing plural prospective callee routines in the program 
listing with characteristics of an indirect caller site in the 
caller routine and eliminating prospective callee routines 
which evidence other than a match with those characteris- 
tics; employing call statistics associated with remaining 
prospective callee routines (and the caller routine) to elimi- 
nate further ones of the prospective callee routines to arrive 
at a set of one or more chosen prospective callee routines. 
The method concludes by in-lining at the indirect caller site 
at least one of the set of chosen prospective callee routines. 
As an alternative, a direct call can be inserted. At run time,
the program listing is executed and, in the process of 
execution, the callee routine is absolutely identified. If the 
identified callee routine has already been in-lined (or there 
is a direct call present), it is executed and the program 
continues. If the identified callee routine is not present in the 
caller's code listing, via either an in-line listing or a direct 
call, an indirect call is executed to the identified callee 
routine. 

20 Claims, 4 Drawing Sheets 
COMPILER FOR REDUCING NUMBER OF
INDIRECT CALLS IN AN EXECUTABLE
CODE

FIELD OF THE INVENTION

This invention relates to a source code compiler which
converts a source listing to an executable object code and,
more particularly, to a source code compiler which reduces
a number of indirect calls in the executable object code and
replaces them with either in-line listings or direct calls.

BACKGROUND OF THE INVENTION

A compiler accomplishes a translation of a source code
listing to a set of object files that are written in machine
language. During the compilation action, code generation
and optimization decisions are made, and the resultant coded
output is then subjected to a linking action which primarily
relocates code and data, resolves branch addresses and
provides binding to run-time libraries.

Many modern programming languages support the concept
of separate compilation, wherein a source code listing
is broken up into separate modules that can be fed individually
to a language translator for generation of the machine
code. The use of source code modules during a compilation
process enables substantial savings in required memory in
the computer on which the compiler executes. In a
co-pending application entitled "Compiler with Intermodular
Procedure Optimization" (Attorney Docket 10961037-1)
Ser. No. 08/795,986 filed Feb. 5, 1997, and assigned to the
same Assignee as this application, a method is described for
improving the optimization of a source code listing which is
compiled in a modular fashion. That method involves the
derivation of a number of program-wide tables which enable
inter-modular referencing to occur, even though individual
modules are, in the main, processed individually during
compilation. A principal use of the invention in the aforesaid
patent application is to enable insertion of in-line code
listings in place of direct call sites in the individual modules
being optimized. A direct call is one wherein a routine is
specifically noted in the routine by a name which enables a
direct reference to the called routine, wherever it is stored.

Such code listings also include indirect calls. An indirect
call is a reference to a subroutine (i.e., the callee) wherein
the subroutine is not identified until program run time.
Indirect calls are present in many of today's programming
languages (e.g., C, Fortran, etc.) and also play a significant
role in object-oriented programming languages like C++ and
Java. Indirect calls, by their very nature, require considerable
processing and procedure delay time for their execution. If
it were possible to identify, in advance, the callee of an
indirect call, the code comprising the callee routine could be
inserted into the caller's routine by in-lining or a direct call
could be inserted to the identified code (a direct call requiring
less processing than an indirect call).

As above indicated, in-lining replaces a call site in the
caller routine with the callee routine's code. In-line substitution
eliminates call overhead and tailors the call to the
particular set of arguments passed at a given caller site.
Nevertheless, since in the prior art the identification of a
callee subject to an indirect call has not been known until run
time, such indirect calls have remained in the compiled code
and have resulted in increases in processing time.

In "Reducing Indirect Function Call Overhead in C++
Programs" Calder et al., ACM Principles of Programming
Languages, Portland, Oreg., 1994, a technique is described
for replacing an indirect call with a matching test, followed
by a direct call. However it is assumed that prior to
optimization, a profiling phase identifies a list of callee
candidates for each indirect call site by observing program
behavior on a test input. Such indirect call information is
expensive to accumulate.

In other prior art, a compiler has been described which
performs a series of passes over a database that contains
information about all of the procedures in an application. A
variety of analyses are performed to provide information as
to which procedures are invoked by a direct call; which
names refer to a same location (alias analysis); which
pointers point to which locations (pointer tracking); which
procedures use which scalars (scalar analysis); and which
procedures should be in-lined at which call sites (in-line
analysis); etc. The results of the analyses are then employed
during a compile action to achieve application improvements.
See "Engineering and Inter-Procedural Optimizing
Compiler", Loeliger et al., Convex Computer Corporation,
Richardson, Tex. (undated).

While the Loeliger et al. procedure performs many
analyses, there is no indication of an attempt to identify, in
advance, callee procedures that are subject to an indirect
call.

Accordingly, it is an object of this invention to provide an
improved compiler which attempts to identify a callee that
is the subject of an indirect call.

It is another object of this invention to provide an
improved compiler which both identifies prospective callees
of indirect calls and either in-lines the code of the identified
callees into caller routine listings or inserts a direct call
thereto.

SUMMARY OF THE INVENTION

A compiler method converts an indirect call to a callee
routine in a caller routine program listing, to an in-line
listing of, or a direct call to, the callee routine in the caller
routine. An indirect call is a call to a callee routine wherein
the callee routine is not absolutely identified until run time
of the program listing. The method includes the steps of:
comparing plural prospective callee routines in the program
listing with characteristics of an indirect caller site in the
caller routine and eliminating prospective callee routines
which evidence other than a match with those characteristics;
employing call statistics associated with remaining
prospective callee routines (and the caller routine) to eliminate
further ones of the prospective callee routines to arrive
at a set of one or more chosen prospective callee routines.
The method concludes by in-lining at the indirect caller site
at least one of the set of chosen prospective callee routines.
As an alternative, a direct call can be inserted. At run time,
the program listing is executed and, in the process of
execution, the callee routine is absolutely identified. If the
identified callee routine has already been in-lined (or there
is a direct call present), it is executed and the program
continues. If the identified callee routine is not present in the
caller's code listing, via either an in-line listing or a direct
call, an indirect call is executed to the identified callee
routine.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a system for carrying out
the invention hereof.

FIG. 2 is a schematic diagram of a global call graph
utilized in the performance of the invention.

FIGS. 3-5 illustrate a logical flow diagram describing the
operation of the invention.

DETAILED DESCRIPTION OF THE
INVENTION

Referring to FIG. 1, a computer 10 includes a central 
processing unit (CPU) 12 which is coupled via a bus system 
14 to a random access memory (RAM) 16, a disk drive 18 
and a read-only memory (ROM) 20. A memory cartridge 22
is employed to insert a source listing into computer 10 and, 
further, may also be used to insert a compiler routine which 
incorporates the invention hereof. 

RAM 16, as an example, provides temporary storage for 
a plurality of code listings that are utilized during the operation
of the invention. Source listing 24 comprises a set of files 
including a plurality of routines to be run in the course of 
execution of the program defined by source listing 24. A 
compiler 26 is employed to convert source listing 24 into 
machine executable object code 28 (that is further stored in 
RAM 16). Compiler 26 includes a translator module 28 
which converts source listing 24 into intermediate represen- 
tation (IR) object code. The IR object code is then fed to an 
optimizer module 30 which performs a number of optimiz- 
ing actions to improve the performance of the overall 
program. Among the subroutines present in optimizer mod- 
ule 30 is an indirect call transform procedure 32, an in-lining 
procedure 34 and a cloning procedure 36. Lastly, a linker 
procedure 38 enables a linking of the optimized object code
modules and outputs executable object code 28 (which is 
stored in RAM 16). 

As above indicated, in-lining replaces a caller's call site
with the actual code of the callee's routine. In-line substitution
serves at least two purposes: it eliminates call overhead
and tailors the call to the particular set of arguments
passed at a given call site. However, unless the callee is
known, in-lining of the callee's routine cannot be performed.
If a direct call is present in a caller's code listing, in-lining
is possible. An example of a direct call is illustrated
immediately below (written in C):



Direct Call

A()
{...

B();...

}



Note that the direct call is present in caller routine A( ) and 
specifically names callee routine B( ). Thus, given a direct 
call to B( ), the listing for B( ) can be readily accessed and 
inserted bodily into the object code listing for caller routine 
A(). 

Shown below is an example of an indirect call: 



Indirect Call 

A() 
{... 

(*X)();...

} 

where: X is dependent on the results of another routine. 

Note that in the indirect call above, the callee routine (*X) 
is undefined as to name and the identity of X only becomes 
apparent at the conclusion of execution of a preceding 
routine. 
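
The two call forms above can be made concrete with a short, compilable C sketch. The routine names B, C and select_callee, and the integer key that stands in for the result of the preceding routine, are illustrative assumptions and do not appear in the listings above.

```c
/* Type of a parameterless routine returning int (illustrative). */
typedef int (*routine_t)(void);

/* Hypothetical callee routines. */
static int B(void) { return 1; }
static int C(void) { return 2; }

/* Stand-in for the "preceding routine" whose result determines X. */
static routine_t select_callee(int key)
{
    return (key == 0) ? B : C;
}

/* Direct call: callee B is named in the caller's own code. */
int A_direct(void)
{
    return B();
}

/* Indirect call: the callee is known only once X has been computed. */
int A_indirect(int key)
{
    routine_t X = select_callee(key);
    return (*X)();
}
```

In A_direct the compiler can see the body of B and may in-line it; in A_indirect nothing at the call site names the callee, which is the situation the transform described below addresses.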

As will be understood from the description below, indirect
call transform 32 identifies one or more prospective callee
routines which can be expected to be the subject of an
indirect call. Once those routines are identified, they are
preferably in-lined into the caller routine, with an indirect
call still being retained in the code listing as a "last resort".
As an alternative, a direct call may be inserted in lieu of the
in-lining.

If the identified prospective callee routine(s) turn out not
to match the actual callee routine that is output by a
preceding routine, then the indirect call is executed, rather
than the in-lined code (or direct call) from incorrectly
identified routines.
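
As a minimal sketch of what a transformed call site might look like, the C fragment below assumes that routines B and C were the chosen prospective callees, that B was in-lined and that C received a direct call. The names B, C, D and A_transformed are hypothetical, and the comparison of function addresses is an assumed stand-in for the matching test.

```c
typedef int (*routine_t)(void);

/* Hypothetical routines: B and C were predicted, D was not. */
static int B(void) { return 1; }
static int C(void) { return 2; }
static int D(void) { return 3; }

/* Caller after the transform; X is the callee identified at run time. */
int A_transformed(routine_t X)
{
    if (X == B) {
        return 1;       /* body of B in-lined at the call site */
    } else if (X == C) {
        return C();     /* direct call inserted for C */
    }
    return (*X)();      /* retained indirect call, the "last resort" */
}
```

Calling A_transformed(D) takes neither guarded path and falls through to the retained indirect call, so a wrong prediction costs only the failed comparisons.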

Turning to FIG. 3, the procedure of the invention will be 
described in conjunction with the flow diagram shown
therein. Initially, a non-optimized form of the object code is 
executed to obtain "profile" data (box 50). During the profile 
run, for each call present in the code listing, the following 
are determined: the number of times the call was made; and 
how often each called routine is run. Subsequent to the 
profile data run, the IR object code is subjected to indirect 
call transform procedure 32 which initially constructs a 
global call graph (box 52). An example of a global call
graph is shown, schematically, in FIG. 2. 

In FIG. 2, node Z represents a routine from which ten calls 
are output. Eight of those calls are direct calls, four to callee 
routine V; three to callee routine U and one to callee routine 
T. Each callee node is connected to the caller by an edge, 
which edge is then "decorated" with statistics indicating the 
number of calls represented by the edge (as determined 
during the profile run). Each edge is further associated with 
data obtained through the use of the profile run (box 50), 
such as the count of calls from a known caller to a known 
callee. Further, an indirect call node is established to rep- 
resent the two indirect calls from caller node Z to presently 
unknown, indirectly called nodes. Note that if node Z
executes a total of ten calls during its execution and that the
total of calls to identified callee nodes is known (i.e.,
4+3+1=8), then it is known there are two indirect calls from
node Z to presently unidentified callees.
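
The arithmetic for node Z can be sketched in C as follows. The struct layout and the names edge, node, indirect_calls and node_z are assumptions made for illustration; only the counts (ten calls in total, with 4, 3 and 1 direct calls to V, U and T) come from the example above.

```c
#include <stddef.h>

/* One decorated edge of the call graph: a known callee and its call count. */
struct edge {
    const char *callee;
    unsigned    calls;
};

/* One caller node with its profiled total and its direct-call edges. */
struct node {
    const char        *name;
    unsigned           total_calls;
    const struct edge *direct;
    size_t             n_direct;
};

/* Calls not accounted for by direct edges are attributed to the
   indirect call node. */
unsigned indirect_calls(const struct node *n)
{
    unsigned direct = 0;
    for (size_t i = 0; i < n->n_direct; i++)
        direct += n->direct[i].calls;
    return (n->total_calls > direct) ? n->total_calls - direct : 0;
}

/* Node Z of FIG. 2: ten calls, eight of them direct (4 + 3 + 1). */
static const struct edge z_edges[] = { { "V", 4 }, { "U", 3 }, { "T", 1 } };
static const struct node node_z   = { "Z", 10, z_edges, 3 };
```

For node_z, indirect_calls returns 2, matching the two indirect calls attributed to node Z in the text.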

Referring back to box 52 in FIG. 3, the global call graph 
creates one node per routine and connects the caller and 
callee nodes by edges. The number of calls is noted on each
edge and an indirect callee node is established to represent 
the unknown routines that are invoked by indirect calls from 
the caller node. 

The procedure now moves to a determination of prospective
callee nodes from indirect caller sites in the caller node.
Each caller node is examined, in turn, to identify prospective
callees and to eliminate those which are least likely to be
called by the caller node. Two principal tests are employed
to eliminate prospective callee nodes from further consideration:
a "signature" match and a "profile" match.

The procedure commences by identifying a first caller
node which includes one or more indirect calls. For each call
site in the caller routine, each prospective callee routine (i.e.,
"node") in the program listing is sequentially accessed and
is analyzed to determine if the number and kind of arguments
it requires match the number and kind of arguments
for the indirect call site, as derived from inspection of the IR
code (box 50, FIG. 3). If there is no match, the prospective
callee routine is skipped and a next prospective callee
routine is accessed and the test repeated. If a match occurs,
the indirect caller site is examined to determine if a return
value it expects matches that which will be returned by the
prospective callee routine. Here again, if there is no match,
the prospective callee routine is skipped and a next prospective
callee routine is accessed.
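
The signature match test just described might be sketched as below. The kind enumeration, the signature struct and the function name signature_match are hypothetical; an actual compiler would read the equivalent information from its IR.

```c
#include <stdbool.h>
#include <stddef.h>

/* Coarse "kind" of an argument or return value (illustrative). */
enum kind { KIND_VOID, KIND_INT, KIND_FLOAT, KIND_POINTER };

/* What a call site passes and expects, or what a routine requires
   and returns. */
struct signature {
    enum kind        ret;
    size_t           n_args;
    const enum kind *args;
};

/* True when the number and kind of arguments match and the return
   value expected by the site matches the one the callee provides. */
bool signature_match(const struct signature *site,
                     const struct signature *callee)
{
    if (site->n_args != callee->n_args)
        return false;               /* argument count differs: skip */
    for (size_t i = 0; i < site->n_args; i++)
        if (site->args[i] != callee->args[i])
            return false;           /* argument kind differs: skip */
    return site->ret == callee->ret;    /* return value check */
}
```

A prospective callee for which signature_match returns false is dropped from consideration; the remainder go on to the profile match test.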

Once all prospective callee routines have been subjected 
to the signature match test, a much smaller subset of 
prospective callee routines remain. That smaller subset is
then subjected to a profile match test which determines
which of the remaining prospective callee routines evidences
one or more indirect calls thereto. The profile match
test (box 58) proceeds by determining the total number of
calls (both direct and indirect) to the prospective callee node.
Next, the number of direct calls to the prospective callee
routine which come from identified caller nodes is
retrieved from the global call graph.

It is then determined if the total number of calls to the
prospective callee node exceeds the direct caller count
thereto. If yes, the excess calls are termed "surplus" and
indicate a potential for the prospective callee routine being
the subject of one or more indirect calls. The number of
surplus calls is recorded and the node is passed to a ranking
step (box 60). If there are no surplus calls, the prospective
callee routine is rejected and a next prospective callee
routine is subjected to the profile match test.

Turning to FIG. 5, at the end of the profile match test, one
or more prospective callee nodes have been identified which
have passed both the signature match and profile match tests.
In general, these tests will greatly reduce the number of
prospective callee nodes to a small number. The next step in
the procedure is to rank the remaining prospective callee
nodes in terms of their number of surplus calls. If the
remaining number of prospective callee routines is greater
than a predetermined number, a threshold comparison can be
utilized to reduce the number of prospective callee routines
by accepting only those which evidence a surplus of calls in
excess of the threshold value. If the number of prospective
callee nodes is few, the threshold can be set to 0.

Thereafter, the procedure moves to box 62 wherein each
of the identified and ranked prospective callee nodes is
accessed to determine and identify each callee routine. The
optimizer then modifies the IR code immediately before the
original indirect call to insert a matching test of the procedure
to be called against the prospective callee nodes. Each
of the identified and ranked prospective callee nodes is
accessed to obtain their respective code listings and each of
those routines is either in-lined in the caller routine at the
call site or a direct call is inserted thereto. The decision
whether to in-line code or insert a direct call is dependent
upon the number of times the callee routine is executed. If
the callee routine is executed often, it is preferably in-lined,
and if infrequently executed, a direct call is used.

Importantly, the indirect call remains at the end of the
in-lining action so that if none of the prospective callee
nodes is the actual one chosen by a preceding processing
action, then the indirect call can be implemented.

At execution of the compiled code, when the indirect call
site is reached, the name of the now-identified indirect callee
routine is compared to the names of the in-lined prospective
callee routines to determine a match or no match state. If a
match exists, the matching in-lined code therefor is executed
and, at the end, a branch action occurs to skip any following
non-executed code to a next step in the procedure after the
indirect call site. If no match is determined, the indirect call
is executed.

An implementation of the above invention has shown that
a transformation designed to implement a guessing strategy
such as described above, in fact leads to a substantial
performance improvement in compiled code.

It should be understood that the foregoing description is
only illustrative of the invention. Various alternatives and
modifications can be devised by those skilled in the art
without departing from the invention. Thus, while the above
description has emphasized the method of the invention, a
memory media (e.g., a diskette) with appropriate code can
be caused to operate a computer to carry out the invention.
Accordingly, the present invention is intended to embrace all
such alternatives, modifications and variances which fall
within the scope of the appended claims.

We claim:

1. A compiler method for converting, in a program listing,
an indirect call from a caller routine to a prospective callee
routine to an in-line listing of said prospective callee routine
in said caller routine or to a direct call to said prospective
callee routine, an indirect call defined as a call to a callee
routine wherein the callee routine is not identified until run
time of said program, said compiler method comprising the
steps of:

a) comparing characteristics of plural prospective callee
routines in said program listing with characteristics of
an indirect caller site in said caller routine and eliminating
prospective callee routines which evidence other
than a match therebetween;

b) employing call statistics associated with prospective
callee routines and said caller routine, to eliminate ones
of said prospective callee routines which have non-matching
call statistics;

c) employing the results of steps a) and b), determining a
chosen set of one or more prospective callee routines;
and

d) inserting at least one of: (i) a code listing for at least one
of said chosen set of one or more prospective callee
routines, or (ii) a direct call for at least one of said
chosen set of one or more prospective callee routines,
at said indirect caller site.

2. The compiler method as recited in claim 1, wherein said
comparing step a) determines a presence or absence of a
signature match between each prospective callee routine and
said indirect caller site, a signature match comprising at least
a comparison of number and kind of parameters passed by
said indirect caller site, to a number and kind of parameters
required by said prospective callee routine, an absence of a
signature match eliminating a prospective callee routine
from further consideration.

3. The compiler method as recited in claim 2, wherein said
comparing step a) also determines if said indirect caller site
expects to receive a return value from an indirect callee
routine and said prospective callee routine provides a return
value to a caller site and, if not, eliminating said prospective
callee routine from further consideration.

4. The compiler method as recited in claim 1, wherein said
employing step b) determines a number of indirect calls to
each prospective callee routine under consideration and
eliminates any prospective callee routine which has no
indirect calls.

5. The compiler method as recited in claim 4, wherein said
employing step b) ranks prospective callee routines by a
number of indirect calls determined for each thereof.

6. The compiler method as recited in claim 1, wherein said
inserting step d) inserts each of said chosen set of one or
more prospective callee routines determined in step b) at
said indirect caller site by in-lining each in said caller
routine.

7. The compiler method as recited in claim 1, further
comprising the added step of:

e) retaining at said indirect caller site an indirect call, to
be implemented in an event none of said chosen set of
one or more prospective callee routines matches an
indirect callee determined at run time.

8. The compiler method as recited in claim 7, further
comprising the steps of:

f) executing said program listing and, upon identification
of a callee routine that is a subject of an indirect call in
said program listing, comparing a name of the callee
routine so identified with names of said chosen set of
one or more prospective callee routines, and executing
one thereof upon determining a match therebetween.

9. The compiler method as recited in claim 8 where, if no
match is found between said name of the callee routine
identified in step f) with names of said chosen set of one or
more prospective callee routines, executing said indirect
call.

10. The compiler method as recited in claim 1, wherein
said inserting step d) employs a determination of a number
of times a code listing is executed to decide whether to insert
a code listing of a prospective callee routine, or to insert a
direct call to a prospective callee routine.

11. A memory media for controlling a computer to execute
a compiler method which converts, in a program listing, an
indirect call from a caller routine to a prospective callee
routine to an in-line listing of said prospective callee routine
in said caller routine or to a direct call to said prospective
callee routine, an indirect call defined as a call to a callee
routine wherein the callee routine is not identified until run
time of said program, said memory media comprising:

a) means for controlling said computer to compare characteristics
of plural prospective callee routines in said
program listing with characteristics of an indirect caller
site in said caller routine and to eliminate prospective
callee routines which evidence other than a match
therebetween;

b) means for controlling said computer to employ call
statistics associated with prospective callee routines
and said caller routine, to eliminate ones of said prospective
callee routines which have non-matching call
statistics;

c) means for controlling said computer to employ the
results of steps a) and b), in determining a chosen set
of one or more prospective callee routines; and

d) means for controlling said computer to insert at least
one of: (i) a code listing for at least one of said chosen
set of one or more prospective callee routines, or (ii) a
direct call for at least one of said chosen set of one or
more prospective callee routines, at said indirect caller
site.

12. The memory media as recited in claim 11, wherein
said means a) determines a presence or absence of a signature
match between each prospective callee routine and said
indirect caller site, a signature match comprising at least a
comparison of number and kind of parameters passed by
said indirect caller site, to a number and kind of parameters
required by said prospective callee routine, an absence of a
signature match eliminating a prospective callee routine
from further consideration.

13. The memory media as recited in claim 12, wherein
said means a) further determines if said indirect caller site
expects to receive a return value from an indirect callee
routine and said prospective callee routine provides a return
value to a caller site and if not, eliminates said prospective
callee routine from further consideration.

14. The memory media as recited in claim 11, wherein
said means b) determines a number of indirect calls to each
prospective callee routine under consideration and eliminates
any prospective callee routine which has no indirect
calls.

15. The memory media as recited in claim 14, wherein
said means b) ranks prospective callee routines by a number
of indirect calls determined for each thereof.

16. The memory media as recited in claim 11, wherein
said means d) inserts each of said chosen set of one or more
prospective callee routines determined by means b) at said
indirect caller site by in-lining each in said caller routine.

17. The memory media as recited in claim 11, further
comprising:

e) means for controlling said computer to retain at said
indirect caller site an indirect call, to be implemented in
an event none of said chosen set of one or more
prospective callee routines matches an indirect callee
determined at run time.

18. The memory media as recited in claim 17, further
comprising:

f) means for controlling said computer to execute said
program listing and upon identification of a callee
routine that is a subject of an indirect call in said
program listing, by comparing a name of the callee routine
so identified with names of said chosen set of one or
more prospective callee routines, to execute one thereof
upon determining a match therebetween.

19. The memory media as recited in claim 18 where, if no
match is found between said name of the callee routine
identified by means f) with names of said chosen set of one
or more prospective callee routines, means f) causes execution
of said indirect call.

20. The memory media as recited in claim 11, wherein
said means d) employs a determination of a number of times
a code listing is executed to decide whether to insert a code
listing of a prospective callee routine, or to insert a direct call
to a prospective callee routine.

* * * * *